SOM-based algorithms for qualitative variables
It is well known that the SOM algorithm achieves a clustering of data which
can be interpreted as an extension of Principal Component Analysis, because of
its topology-preserving property. But the SOM algorithm can only process
real-valued data. In previous papers, we have proposed several methods based on
the SOM algorithm to analyze categorical data, as is typical of survey
data. In this paper, we present these methods in a unified manner. The first
one (Kohonen Multiple Correspondence Analysis, KMCA) deals only with the
modalities, while the two others (Kohonen Multiple Correspondence Analysis with
individuals, KMCA_ind, Kohonen algorithm on DISJonctive table, KDISJ) can take
into account the individuals and the modalities simultaneously. Comment: Special issue following WSOM 03 in Kitakiush
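As a rough illustration of the kind of input such methods work on, the sketch below codes categorical answers as a complete disjunctive (indicator) table and derives the Burt table of modality co-occurrences from it. These are the standard tables of multiple correspondence analysis; the toy records and function names are hypothetical, and any rescaling applied before training a Kohonen map is left out.

```python
def disjunctive_table(records):
    """One-hot code categorical records into a complete disjunctive table.

    records: list of dicts mapping variable name -> modality (hypothetical toy format).
    Returns (columns, table): columns is a sorted list of (variable, modality)
    pairs and table holds one 0/1 row per individual.
    """
    columns = sorted({(var, mod) for rec in records for var, mod in rec.items()})
    table = [[1 if rec.get(var) == mod else 0 for var, mod in columns]
             for rec in records]
    return columns, table


def burt_table(table):
    """Burt table B = D^T D: co-occurrence counts between all pairs of modalities."""
    n_cols = len(table[0])
    return [[sum(row[i] * row[j] for row in table) for j in range(n_cols)]
            for i in range(n_cols)]


# Hypothetical survey answers for three individuals.
records = [
    {"sex": "F", "contract": "part-time"},
    {"sex": "M", "contract": "full-time"},
    {"sex": "F", "contract": "full-time"},
]
columns, D = disjunctive_table(records)
B = burt_table(D)
```

Each row of `D` describes one individual and each row of `B` describes one modality, which is why a method working only on the second table sees the modalities but not the individuals.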
Dynamical Equilibrium, trajectories study in an economical system. The case of the labor market
The paper deals with the study of labor market dynamics, and aims to
characterize its equilibria and possible trajectories. The theoretical
background is the theory of the segmented labor market. The main idea is that
this theory is well suited to interpreting the observed trajectories, owing to the
heterogeneity of work situations. Comment: accepted to the WSOM 2007 Conference (Bielefield
Missing values : processing with the Kohonen algorithm
Published on the conference website amsda2005.eznst-bretagne.fr, pages 489-496. International audience. Processing data that contain missing values is a complicated and always awkward problem when the data come from real-world contexts. In applications, we very often face observations for which not all values are available, and this can occur for many reasons: typing errors, fields left unanswered in surveys, etc. Most statistical software (SAS, for example) simply discards incomplete observations. This has no practical consequence when the data are very numerous. But if the number of remaining observations is too small, the results can lose all significance. To avoid discarding data in that way, it is possible to replace a missing value with the mean value of the corresponding variable, but this approximation can be very poor when the variable has a large variance. It is therefore worth noting that the Kohonen algorithm (as well as the Forgy algorithm) copes very well with data containing missing values, without having to estimate them beforehand. We are particularly interested in the Kohonen algorithm for its visualization properties.
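One common way to let a Kohonen map handle incomplete observations, assumed here rather than taken from the abstract, is to compute distances over the observed components only and to update only those components of the code vectors. The sketch below does this for a one-dimensional map; the `None` encoding, the learning-rate and neighbourhood schedules, and the toy data are all hypothetical choices.

```python
import random


def partial_sq_dist(x, w):
    """Squared Euclidean distance over the components of x that are observed (not None)."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w) if xi is not None)


def best_unit(x, units):
    """Index of the winning unit for x, ignoring its missing components."""
    return min(range(len(units)), key=lambda k: partial_sq_dist(x, units[k]))


def train_som_with_missing(data, n_units=4, epochs=50, seed=0):
    """Train a one-dimensional Kohonen map (a string of units) on incomplete data.

    Missing values are encoded as None and are skipped both when searching
    for the winning unit and when updating the code vectors.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Random initial code vectors in [0, 1]; assumes inputs scaled to that range.
    units = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        lr = 0.5 * (1 - epoch / epochs)            # decreasing learning rate
        radius = 1 if epoch < epochs // 2 else 0   # shrinking neighbourhood
        for x in rng.sample(data, len(data)):      # visit observations in random order
            winner = best_unit(x, units)
            lo, hi = max(0, winner - radius), min(n_units, winner + radius + 1)
            for k in range(lo, hi):
                for j, xj in enumerate(x):
                    if xj is not None:             # update observed components only
                        units[k][j] += lr * (xj - units[k][j])
    return units


# Hypothetical toy data: two groups of observations, some values missing.
data = [
    [0.1, 0.2, None], [0.0, None, 0.1], [0.2, 0.1, 0.0],
    [0.9, None, 1.0], [1.0, 0.8, None], [0.8, 0.9, 0.9],
]
units = train_som_with_missing(data)
labels = [best_unit(x, units) for x in data]
```

No observation is dropped and no value is imputed before training; if an estimate of a missing entry is wanted afterwards, the corresponding component of the winning unit is one natural candidate.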
Using working patterns as a basis for differentiating part-time employment
ACSEG 2002, published in EJESS. Seeking to determine which working patterns have a specific effect on part-time work, in 1998-99 France's INSEE statistical agency carried out a Timetable survey that questioned the homogeneity of this form of employment (again in terms of the working patterns upon which it is based). A neural method was used to classify an entire sample of part-time employees according to their weekly working patterns, and the end result was that part-time work was shown to be a very heterogeneous form of employment. This was reflected not only by the existence of many different groups of part-time employees, each with highly differentiated individual and professional characteristics, but also (and above all) by the diversity of their weekly working patterns.
Dynamical Equilibrium, trajectories study in an economical system. The case of the labor market.
The paper deals with the study of labor market dynamics, and aims to characterize its equilibria and possible trajectories. The theoretical background is the theory of the segmented labor market. The main idea is that this theory is well suited to interpreting the observed trajectories, owing to the heterogeneity of work situations. Keywords: segmented labor market; Kohonen maps; trajectories
Consumer Profile Identification and Allocation
We propose an easy-to-use methodology for assigning new individuals to one of
the groups previously built from a complete learning database. The learning
database contains continuous and categorical variables for each individual.
The groups (clusters) are built using only the continuous variables and
described with the help of the categorical ones. For the new individuals, only
the categorical variables are available, so it is necessary to define a model
that computes the probabilities of belonging to each cluster from the
categorical variables alone. This model then provides a decision rule for
assigning the new individuals and gives decision-makers an efficient tool. The
tool is shown to be very efficient, for example, for allocating customers to
consumer clusters for marketing purposes. Comment: Accepted at the IWANN 07 conference, San Sebastian, June 2007
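The abstract does not say which probability model is used. Purely as an illustration of the allocation step, the sketch below swaps in a naive Bayes classifier with Laplace smoothing: it estimates the probability of each cluster from categorical variables alone and assigns a new individual to the most probable one. The variable names, the toy learning base, and the cluster labels are hypothetical.

```python
from collections import Counter, defaultdict


def fit_naive_bayes(rows, clusters):
    """Count cluster sizes and per-cluster modality frequencies.

    rows: list of dicts (categorical variable -> modality).
    clusters: cluster label of each row, built beforehand from continuous variables.
    """
    cluster_counts = Counter(clusters)
    modality_counts = defaultdict(Counter)   # (cluster, variable) -> modality counts
    modalities = defaultdict(set)            # variable -> modalities seen in training
    for row, c in zip(rows, clusters):
        for var, mod in row.items():
            modality_counts[(c, var)][mod] += 1
            modalities[var].add(mod)
    return cluster_counts, modality_counts, modalities


def cluster_probabilities(row, model):
    """Posterior probability of each cluster given the categorical values only."""
    cluster_counts, modality_counts, modalities = model
    total = sum(cluster_counts.values())
    scores = {}
    for c, n_c in cluster_counts.items():
        p = n_c / total                      # prior: relative cluster size
        for var, mod in row.items():
            # Laplace smoothing avoids zero probabilities for unseen combinations.
            p *= (modality_counts[(c, var)][mod] + 1) / (n_c + len(modalities[var]))
        scores[c] = p
    norm = sum(scores.values())
    return {c: s / norm for c, s in scores.items()}


def allocate(row, model):
    """Decision rule: assign the individual to the most probable cluster."""
    probs = cluster_probabilities(row, model)
    return max(probs, key=probs.get)


# Hypothetical learning base; clusters "A" and "B" were built elsewhere.
rows = [
    {"age_band": "young", "housing": "rent"},
    {"age_band": "young", "housing": "rent"},
    {"age_band": "senior", "housing": "own"},
    {"age_band": "senior", "housing": "own"},
]
clusters = ["A", "A", "B", "B"]
model = fit_naive_bayes(rows, clusters)
probs = cluster_probabilities({"age_band": "young", "housing": "rent"}, model)
```

Returning the full probability vector rather than only the winning cluster lets a decision-maker see how clear-cut each allocation is.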
How to improve robustness in Kohonen maps and display additional information in Factorial Analysis: application to text mining
This article is an extended version of a paper presented at the WSOM'2012
conference [1]. We present a combination of factorial projections, the SOM
algorithm and graph techniques applied to a text mining problem. The corpus
contains 8 medieval manuscripts which were used to teach arithmetic techniques
to merchants. Among the techniques for Data Analysis, those used for
Lexicometry (such as Factorial Analysis) highlight the discrepancies between
manuscripts. The reason for this is that they focus on the deviation from the
independence between words and manuscripts. Still, we also want to discover and
characterize the common vocabulary among the whole corpus. Using the properties
of stochastic Kohonen maps, which define neighborhood between inputs in a
non-deterministic way, we highlight the words which seem to play a special role
in the vocabulary. We call them fickle and use them to improve both the
robustness of the Kohonen map and the significance of the FCA visualization.
Finally, we use graph algorithms to exploit this fickleness for the classification of words.
Self-organizing maps for exploratory data analysis and visualization (Cartes auto-organisées pour l'analyse exploratoire de données et la visualisation)
Survey article on applications of the Kohonen algorithm to data visualization and analysis. This paper shows how to use the Kohonen algorithm to represent multidimensional data by exploiting its self-organizing property. Such maps can be obtained for quantitative variables as well as for qualitative ones, or for a mixture of both. The content of the paper draws on various works by SAMOS-MATISSE members, in particular E. de Bodt, B. Girard, P. Letrémy, S. Ibbou, and P. Rousset. Most of the examples were studied with computation routines written by Patrick Letrémy in the IML-SAS language, which are available on the web page http://samos.univ-paris1.fr